Swift Aligner: A Tool for the Visualization and Correction of Word Alignment and for Cross Language Transfer
Authors: O. Scrivner, T. Gilmanov
Abstract
It is well known that parallel corpora are valuable linguistic resources. One benefit of such corpora is that they allow an annotated corpus to be built for resource-poor languages via cross-language transfer. That is, given an accurate alignment between a word in a source language and its equivalent in a target language, linguistic information such as part-of-speech tags or syntactic annotation can be projected onto the aligned word. While there are several state-of-the-art word aligners, such as GIZA++ and Berkeley, there is no simple visual tool for correcting and editing aligned corpora in different formats. We have developed Swift Aligner, a free, portable software tool that facilitates the visual representation of corpora, the correction of alignments, and finally the transfer of morphological information and syntactic relations from an annotated source language to an unannotated target language by means of word alignment. In addition, the tool is flexible: it imports corpora in various formats, such as GIZA++, Berkeley, and LIHLA. Finally, we show that cross-language transfer requires only an estimated 30% correction by a human annotator, compared with 100% manual annotation.
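The projection step described in the abstract can be illustrated with a minimal Python sketch. It is not Swift Aligner's implementation and does not parse the GIZA++, Berkeley, or LIHLA formats; the function name `project_tags`, the representation of an alignment as `(source_index, target_index)` pairs, and the example sentences are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def project_tags(
    source_tags: List[str],
    alignment: List[Tuple[int, int]],
    target_length: int,
) -> List[Optional[str]]:
    """Copy each source tag onto the target word it is aligned with.

    Hypothetical helper: `alignment` is assumed to be a list of
    (source_index, target_index) pairs. Target words with no alignment
    link keep None and are left for a human annotator to fill in.
    """
    projected: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignment:
        in_range = 0 <= src_idx < len(source_tags) and 0 <= tgt_idx < target_length
        # Keep the first tag projected onto a target word; later links are ignored.
        if in_range and projected[tgt_idx] is None:
            projected[tgt_idx] = source_tags[src_idx]
    return projected


# Illustrative data (not from the paper): English source, French target.
source_tokens = ["the", "black", "cat", "sleeps"]
source_tags = ["DET", "ADJ", "NOUN", "VERB"]
target_tokens = ["le", "chat", "noir", "dort"]
alignment = [(0, 0), (1, 2), (2, 1), (3, 3)]

projected = project_tags(source_tags, alignment, len(target_tokens))
for token, tag in zip(target_tokens, projected):
    print(f"{token}\t{tag}")
```

Target words that receive no tag, or a wrong one because of an alignment error, are the cases where manual correction would still be needed.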
Publication date: 2013